Words containing ,:
Rank in Wordlist | Frequency | Word |
---|---|---|
67925 | 286 | website,the |
71978 | 258 | x,y |
91130 | 170 | holiday,Apartments |
110091 | 120 | X,Y |
110394 | 120 | Wigginton,Yorkshire,England |
111416 | 118 | Rolls,E.T |
120179 | 103 | B,L,D |
127689 | 93 | A,B |
131042 | 88 | A,B,C |
131528 | 88 | I,m |

Words containing (:
Rank in Wordlist | Frequency | Word |
---|---|---|
32287 | 1043 | Hour(s |
43823 | 621 | Java(tm |
44618 | 603 | A1(M |
44623 | 603 | BA(Hons |
44957 | 595 | name(s |
46853 | 553 | item(s |
47890 | 532 | person(s |
49596 | 500 | BSc(Hons |
50167 | 491 | Adobe(R |
52170 | 458 | author(s |

Words containing ):
Rank in Wordlist | Frequency | Word |
---|---|---|
45503 | 583 | C)2002 |
53884 | 432 | C)2001 |
54421 | 425 | C)1961-2002 |
62987 | 328 | C)2000 |
63063 | 327 | s)he |
74763 | 241 | prolog):-1 |
86931 | 185 | C)1999 |
90001 | 173 | C)Copyright |
90408 | 172 | C)1998 |
96984 | 152 | C)1999/2000 |

Words containing %:
Rank in Wordlist | Frequency | Word |
---|---|---|
354562 | 16 | `%s |
388849 | 14 | 12%vol |
399525 | 14 | TEN%LESS |
450332 | 11 | 12.5%vol |
477556 | 10 | 50%RH |
479606 | 10 | 50%of |
480850 | 10 | 1%pa |
523032 | 9 | 25%of |
531626 | 9 | 80%-ile |
553198 | 8 | 80%of |

Words containing &:
Rank in Wordlist | Frequency | Word |
---|---|---|
9907 | 6892 | R&D |
11416 | 5529 | B&B |
15856 | 3314 | W&M |
21881 | 1990 | P&O |
22145 | 1951 | A&E |
22405 | 1915 | p&p |
22546 | 1894 | R&B |
23103 | 1824 | P&P |
24503 | 1658 | don't |
26644 | 1442 | it's |

Words containing $:
Rank in Wordlist | Frequency | Word |
---|---|---|
96500 | 153 | US$1 |
101629 | 139 | US$100 |
112619 | 116 | US$10 |
118148 | 106 | US$20 |
124959 | 96 | US$30 |
133494 | 85 | Ru$hden |
135439 | 83 | US$50 |
144703 | 74 | US$5 |
146169 | 73 | US$200 |
146193 | 73 | Micro$oft |

Words containing +:
Rank in Wordlist | Frequency | Word |
---|---|---|
62704 | 331 | translations+PDB+SwissProt+SPupdate+PIR |
63764 | 321 | GenBank+EMBL+DDBJ+PDB |
81369 | 208 | P+P |
86106 | 188 | p+p |
95921 | 155 | B+B |
103445 | 135 | HUBER+SUHNER |
104534 | 132 | n+1 |
107622 | 125 | VG+/VG |
109797 | 121 | DVD+RW |
109819 | 121 | b+w |

Words containing /:
Rank in Wordlist | Frequency | Word |
---|---|---|
1910 | 55376 | and/or |
7457 | 10266 | his/her |
10473 | 6322 | Rule/Error |
10477 | 6317 | he/she |
11706 | 5324 | I/O |
11821 | 5246 | TCP/IP |
14810 | 3683 | HIV/AIDS |
17210 | 2915 | tea/coffee |
18067 | 2688 | b/w |
18127 | 2673 | him/her |

Words containing =:
Rank in Wordlist | Frequency | Word |
---|---|---|
70212 | 270 | =========== |
119581 | 104 | ==== |
128396 | 92 | ========================================================= |
128906 | 91 | DG=Double |
148845 | 71 | ====== |
151718 | 68 | True=Yes,False=No |
152304 | 68 | C=O |
163770 | 60 | == |
169184 | 56 | X=B |
185687 | 48 | n=2 |

Words containing _:
Rank in Wordlist | Frequency | Word |
---|---|---|
35551 | 889 | tmpseq_1 |
54325 | 426 | __ |
57121 | 389 | WS_FTP |
80590 | 211 | public_html |
117615 | 107 | mod_ssl |
118228 | 106 | mod_rewrite |
124178 | 97 | mod_proxy |
137597 | 81 | TCL_ERROR |
138769 | 80 | Back_to_Newsletter_Index |
139949 | 79 | TCL_OK |

In the last subsection of this type, we look for words containing other special characters: , ( ) % & $ " ' + * = / _
Depending on the language, some of these characters may be allowed within words; others will not be. If words containing forbidden characters occur with more than very low frequency, this may indicate a problem in preprocessing.
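The check described above can be sketched in a few lines. This is only an illustration, assuming the word list is available as (word, frequency) pairs. The names `FORBIDDEN` and `flag_suspicious` and the threshold of 100 are hypothetical choices, not part of the original tooling, and the character set would need to be adapted per language (the apostrophe, for example, is legitimate in English).

```python
from collections import defaultdict

# Hypothetical choice: characters treated as forbidden inside words.
# Adapt this set to the language being checked.
FORBIDDEN = set(",()%&$\"'+*=/_")


def flag_suspicious(entries, min_freq=100):
    """Group words that contain a forbidden character and occur at least min_freq times."""
    flagged = defaultdict(list)
    for word, freq in entries:
        if freq < min_freq:
            continue  # very rare words are treated as harmless noise
        for ch in sorted(set(word) & FORBIDDEN):
            flagged[ch].append((word, freq))
    return dict(flagged)


# A few entries taken from the tables in this section, plus one ordinary word.
sample = [
    ("and/or", 55376),
    ("R&D", 6892),
    ("website,the", 286),
    ("12%vol", 14),
    ("house", 9000),
]
result = flag_suspicious(sample)
```

With this sample, `and/or`, `R&D`, and `website,the` are flagged under their respective characters, while `12%vol` falls below the assumed threshold and is ignored.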
Example query for words containing %:
select w_id-100,freq, word from words where w_id>100 and word like "%\%%" limit 10;
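The same query can be repeated for each character in the list. The sketch below uses Python's built-in `sqlite3` module with a small in-memory stand-in for the `words` table, so the rows and the helper name `top_words_containing` are assumptions for illustration only. Two points differ from the query above: `%` and `_` are both LIKE wildcards and must be escaped, and SQLite needs an explicit `ESCAPE` clause (the original query appears to rely on backslash being the default escape character, as in MySQL). The `ORDER BY w_id` is also an addition, assuming that `w_id` reflects the frequency rank.

```python
import sqlite3


def top_words_containing(conn, char, limit=10):
    """Return (rank, freq, word) rows for words that contain `char`."""
    # Escape the LIKE wildcards (% and _) and the escape character itself.
    escaped = char.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    pattern = f"%{escaped}%"
    sql = (
        "SELECT w_id - 100, freq, word FROM words "
        "WHERE w_id > 100 AND word LIKE ? ESCAPE '\\' "
        "ORDER BY w_id LIMIT ?"
    )
    return conn.execute(sql, (pattern, limit)).fetchall()


# Hypothetical stand-in for the real table; the offset of 100 is taken from the query above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (w_id INTEGER PRIMARY KEY, word TEXT, freq INTEGER)")
conn.executemany(
    "INSERT INTO words VALUES (?, ?, ?)",
    [
        (101, "and/or", 55376),
        (102, "public_html", 211),
        (103, "house", 150),
        (104, "12%vol", 14),
    ],
)

rows_percent = top_words_containing(conn, "%")
rows_underscore = top_words_containing(conn, "_")
```

Without the escaping step, a search for `_` would match every non-empty word, because an unescaped `_` matches any single character.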